Seed specific transcriptional regulation.

Nucleic acid sequences and methods for their use are provided which provide for seed specific transcription,
in order to modulate or modify expression in seed, particularly embryo cells. Transcriptional initiation regions
are identified and isolated from plant cells and used to prepare expression cassettes which may then be
transformed into plant cells for seed specific transcription. The method finds particular use in conjunction
with modifying fatty acid production in seed tissue.

Description

SEED SPECIFIC TRANSCRIPTIONAL REGULATION 



INTRODUCTION

Technical Field 

Genetic modification of plant material is provided for seed specific transcription. Production of endogenous
products may be modulated or new capabilities provided.

Background

The primary emphasis in genetic modification has been directed to prokaryotes and mammalian cells. For a
variety of reasons plants have proven more intransigent than other eukaryotic cells in the ability to genetically
manipulate the plants. In part, this has been the result of the different goals involved, since for the most part
plant modification has been directed to modifying the entire plant or a particular plant part in a live plant, as
distinct from modifying cells in culture.

For many applications, it will be desirable to provide for transcription in a particular plant part or at a
particular time in the growth cycle of the plant. Toward this end, there is a substantial interest in identifying
endogenous plant products whose transcription or expression is regulated in a manner of interest. In
identifying such products, one must first look for products which appear at a particular time in the cell growth
cycle or in a particular plant part, demonstrate their absence at other times or in other parts, identify nucleic acid
sequences associated with the product and then identify the sequence in the genome of the plant in order to
obtain the 5'-untranslated sequence associated with transcription. This requires substantial investigation in
first identifying the particular sequence, followed by establishing that it is the correct sequence and isolating
the desired transcriptional regulatory region. One must then prepare appropriate constructs, followed by
demonstration that the constructs are efficacious in the desired manner.

Identifying such sequences is a challenging project, subject to substantial pitfalls and uncertainty. There is,
however, substantial interest in being able to genetically modify plants, which justifies the substantial
expenditures and efforts in identifying transcriptional sequences and manipulating them to determine their
utility.

Relevant Literature 

Crouch et al., In: Molecular Form and Function of the Plant Genome, eds van Vloten-Doting, Groot and Hall,
Plenum Publishing Corp. 1985, pp 555-566; Crouch and Sussex, Planta (1981) 153:64-74; Crouch et al., J. Mol.
Appl. Genet. (1983) 2:273-283; and Simon et al., Plant Molecular Biology (1985) 5:191-201, describe various
aspects of Brassica napus storage proteins. Beachy et al., EMBO J. (1985) 4:3047-3053; Sengupta-Gopalan et
al., Proc. Natl. Acad. Sci. USA (1985) 82:3320-3324; Greenwood and Chrispeels, Plant Physiol. (1985) 79:65-7?
and Chen et al., Proc. Natl. Acad. Sci. USA (1986) 83:8560-8564 describe studies concerned with seed storage
proteins and genetic manipulation. Eckes et al., Mol. Gen. Genet. (1986) 205:14-22 and Fluhr et al., Science
(1986) 232:1106-1112 describe the genetic manipulation of light inducible plant genes.

SUMMARY OF THE INVENTION 

DNA constructs are provided which are employed in manipulating plant cells to provide for seed-specific
transcription. Particularly, storage protein transcriptional regions are joined to other than the wild-type gene
and introduced into plant genomes to provide for seed-specific transcription. The constructs provide for
modulation of endogenous products as well as production of heterologous products.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 

Novel DNA constructs are provided which allow for modification of transcription in seed, particularly in
embryos during seed maturation. The DNA constructs comprise a regulated transcriptional initiation region
associated with seed formation, preferably in association with embryogenesis and seed maturation. Of
particular interest are those transcriptional initiation regions associated with storage proteins, such as napin,
cruciferin, β-conglycinin, phaseolin, or the like. The transcriptional initiation regions may be obtained from any
convenient host, particularly plant hosts such as Brassica, e.g. napus or campestris, soybean (Glycine max),
bean (Phaseolus vulgaris), corn (Zea mays), cotton (Gossypium sp.), safflower (Carthamus tinctorius), tomato
(Lycopersicon esculentum), and Cuphea species.

Downstream from and under the transcriptional initiation regulation of the seed specific region will be a
sequence of interest which will provide for modification of the phenotype of the seed, by modulating the
production of an endogenous product, as to amount, relative distribution, or the like, or production of a
heterologous expression product to provide for a novel function or product in the seed. The DNA construct will
also provide for a termination region, so as to provide an expression cassette into which a gene may be
introduced. Conveniently, transcriptional initiation and termination regions may be provided separated in the
direction of transcription by a linker or polylinker having one or a plurality of restriction sites for insertion of the
gene to be under the transcriptional regulation of the regulatory regions. Usually, the linker will have from 1 to
10, more usually from about 1 to 8, preferably from about 2 to 6 restriction sites. Generally, the linker will be
fewer than 100 bp, frequently fewer than 60 bp and generally at least about 5 bp.
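
The linker ranges quoted above can be expressed as a simple check. The following is only an illustrative sketch: the function names are hypothetical, and pairing the "2 to 6 sites" preference with the "fewer than 60 bp" figure in one test is an assumption made for convenience, not something the text requires.

```python
def linker_in_typical_range(n_sites: int, length_bp: int) -> bool:
    """True if a linker falls inside the broad ranges quoted in the text."""
    # 1 to 10 restriction sites; at least about 5 bp; fewer than 100 bp.
    return 1 <= n_sites <= 10 and 5 <= length_bp < 100


def linker_in_narrower_range(n_sites: int, length_bp: int) -> bool:
    """Narrower check: 2 to 6 sites and fewer than 60 bp (assumed pairing)."""
    return 2 <= n_sites <= 6 and 5 <= length_bp < 60
```

For instance, a hypothetical 40 bp polylinker carrying three sites passes both checks, whereas a 120 bp linker fails the first.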

The transcriptional initiation region may be native or homologous to the host or foreign or heterologous to
the host. By foreign is intended that the transcriptional initiation region is not found in the wild-type host into
which the transcriptional initiation region is introduced.

Transcriptional initiation regions of particular interest are those associated with the Brassica napus or
campestris napin genes, acyl carrier proteins, genes that express from about day 7 to day 40 in seed,
particularly having maximum expression from about day 10 to about day 20, where the expressed gene is not
found in leaves, while the expressed product is found in seed in high abundance.

The transcriptional cassette will include in the 5'-3' direction of transcription, a transcriptional and
translational initiation region, a sequence of interest, and a transcriptional and translational termination region
functional in plants. One or more introns may also be present. The DNA sequence may have any open reading
frame encoding a peptide of interest, e.g. an enzyme, or a sequence complementary to a genomic sequence,
where the genomic sequence may be an open reading frame, an intron, a non-coding leader sequence, or any
other sequence where the complementary sequence will inhibit transcription, messenger RNA processing,
e.g. splicing, or translation. The DNA sequence of interest may be synthetic, naturally derived, or combinations
thereof. Depending upon the nature of the DNA sequence of interest, it may be desirable to synthesize the
sequence with plant preferred codons. The plant preferred codons may be determined from the codons of
highest frequency in the proteins expressed in the largest amount in the particular plant species of interest.
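
The determination of preferred codons described above amounts to a frequency count. The following is a minimal sketch, assuming the caller has already chosen the coding sequences of the most abundantly expressed proteins; the function name and the toy input are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code, laid out in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def preferred_codons(coding_seqs):
    """Map each amino acid to its most frequent codon across the given ORFs."""
    counts = defaultdict(Counter)
    for seq in coding_seqs:
        seq = seq.upper()
        # Walk the sequence codon by codon, ignoring any trailing partial codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon)
            if aa and aa != "*":          # skip stops and ambiguous codons
                counts[aa][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


# Toy input: one short ORF in which GAG outnumbers GAA for glutamate.
example = preferred_codons(["ATGGAGGAGGAATAA"])
```

A synthetic gene would then be written using `example[aa]` for each residue; weighting by protein abundance, which the text implies, is left to the choice of input sequences.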

In preparing the transcription cassette, the various DNA fragments may be manipulated, so as to provide for
the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this
end, adapters or linkers may be employed for joining the DNA fragments or other manipulations may be
involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or
the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resection, ligation, or the
like may be employed, where insertions, deletions or substitutions, e.g. transitions and transversions, may be
involved.

The termination region which is employed will be primarily one of convenience, since the termination regions
appear to be relatively interchangeable. The termination region may be native with the transcriptional initiation
region, may be native with the DNA sequence of interest, or may be derived from another source. Convenient
termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and
nopaline synthase termination regions.

By appropriate manipulations, such as restriction, chewing back or filling in overhangs to provide blunt
ends, ligation of linkers, or the like, complementary ends of the fragments can be provided for joining and
ligation.

In carrying out the various steps, cloning is employed, so as to amplify the amount of DNA and to allow for
analyzing the DNA to ensure that the operations have occurred in a proper manner. A wide variety of cloning
vectors are available, where the cloning vector includes a replication system functional in E. coli and a marker
which allows for selection of the transformed cells. Illustrative vectors include pBR322, pUC series, M13mp
series, pACYC184, etc. Thus, the sequence may be inserted into the vector at an appropriate restriction
site(s), the resulting plasmid used to transform the E. coli host, the E. coli grown in an appropriate nutrient
medium and the cells harvested and lysed and the plasmid recovered. Analysis may involve sequence analysis,
restriction analysis, electrophoresis, or the like. After each manipulation the DNA sequence to be used in the
final construct may be restricted and joined to the next sequence, where each of the partial constructs may be
cloned in the same or different plasmids.

In addition to the transcription construct, depending upon the manner of introduction of the transcription
construct into the plant, other DNA sequences may be required. For example, when using the Ti- or Ri-plasmid
for transformation of plant cells, as described below, at least the right border and frequently both the right and
left borders of the T-DNA of the Ti- and Ri-plasmids will be joined as flanking regions to the transcription
construct. The use of T-DNA for transformation of plant cells has received extensive study and is amply
described in EPA Serial No. 120,516, Hoekema, In: The Binary Plant Vector System, Offsetdrukkerij Kanters
B.V., Alblasserdam, 1985, Chapter V, Fraley, et al., Crit. Rev. Plant Sci., 4:1-46, and An et al., EMBO J. (1985)
4:277-284.

Alternatively, to enhance integration into the plant genome, terminal repeats of transposons may be used as
borders in conjunction with a transposase. In this situation, expression of the transposase should be
inducible, or the transposase inactivated, so that once the transcription construct is integrated into the
genome, it should be relatively stably integrated and avoid hopping.

The transcription construct will normally be joined to a marker for selection in plant cells. Conveniently, the
marker may be resistance to a biocide, particularly an antibiotic, such as kanamycin, G418, bleomycin,
hygromycin, chloramphenicol, or the like. The particular marker employed will be one which will allow for
selection of transformed cells as compared to cells lacking the DNA which has been introduced.

A variety of techniques are available for the introduction of DNA into a plant cell host. These techniques
include transformation with Ti-DNA employing A. tumefaciens or A. rhizogenes as the transforming agent,
protoplast fusion, injection, electroporation, etc. For transformation with Agrobacterium, plasmids can be
prepared in E. coli which plasmids contain DNA homologous with the Ti-plasmid, particularly T-DNA. The
plasmid may or may not be capable of replication in Agrobacterium, that is, it may or may not have a broad
spectrum prokaryotic replication system, e.g. RK290, depending in part upon whether the transcription
construct is to be integrated into the Ti-plasmid or be retained on an independent plasmid. By means of a
helper plasmid, the transcription construct may be transferred to the A. tumefaciens and the resulting
transformed organism used for transforming plant cells.
Conveniently, explants may be cultivated with the A. tumefaciens or A. rhizogenes to allow for transfer of the
transcription construct to the plant cells, the plant cells dispersed in an appropriate selective medium for
selection, grown to callus, shoots grown and plantlets regenerated from the shoots by growing in rooting
medium. The Agrobacterium host will contain a plasmid having the vir genes necessary for transfer of the
T-DNA to the plant cells and may or may not have T-DNA. For injection and electroporation, disarmed
Ti-plasmids (lacking the tumor genes, particularly the T-DNA region) may be introduced into the plant cell.

The constructs may be used in a variety of ways. Particularly, the constructs may be used to modify the fatty
acid composition in seeds, that is changing the ratio and/or amounts of the various fatty acids, as to length,
unsaturation, or the like. Thus, the fatty acid composition may be varied, enhancing the fatty acids of from 10 to
14 carbon atoms as compared to the fatty acids of from 16 to 18 carbon atoms, increasing or decreasing fatty
acids of from 20 to 24 carbon atoms, providing for an enhanced proportion of fatty acids which are saturated or
unsaturated, or the like. These results can be achieved by providing for reduction of expression of one or more
endogenous products, particularly enzymes or cofactors, by producing a transcription product which is
complementary to the transcription product of a native gene, so as to inhibit the maturation and/or expression
of the transcription product, or providing for expression of a gene, either endogenous or exogenous,
associated with fatty acid synthesis. Expression products associated with fatty acid synthesis include acyl
carrier protein, thioesterase, acetyl transacylase, acetyl-CoA carboxylase, ketoacyl-synthases, malonyl
transacylase, stearoyl-ACP desaturase, and other desaturase enzymes.
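
A transcription product complementary to a native transcript is obtained by transcribing the reverse complement of the sense strand. The following sketch shows only that sequence relationship; the function names and the six-base example are hypothetical, and nothing here models how effectively such a transcript would inhibit expression.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA (5' to 3') complementary to the mRNA specified by the sense strand."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


# Hypothetical sense fragment; its mRNA would read AUGGCC.
fragment = "ATGGCC"
inverted_insert = reverse_complement(fragment)   # what is cloned behind the promoter
antisense = antisense_rna(fragment)              # the resulting transcript
```

In cassette terms, this corresponds to inserting the gene fragment in the inverted orientation between the initiation and termination regions.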

Alternatively, one may wish to provide various products from other sources including mammals, such as
blood factors, lymphokines, colony stimulating factors, interferons, plasminogen activators, enzymes, e.g.
superoxide dismutase, chymosin, etc., hormones, rat mammary thioesterase 2, phospholipid acyl desaturases
involved in the synthesis of eicosapentaenoic acid, human serum albumin. Another purpose is to increase the
level of seed proteins, particularly mutated seed proteins, having an improved amino acid distribution which
would be better suited to the nutrient value of the seed. In this situation, one might provide for inhibition of the
native seed protein by producing a complementary DNA sequence to the native coding region or non-coding
region, where the complementary sequence would not efficiently hybridize to the mutated sequence, or
inactivate the native transcriptional capability.

The cells which have been transformed may be grown into plants in accordance with conventional ways.
See, for example, McCormick et al., Plant Cell Reports (1986) 5:81-84. These plants may then be grown, and
either pollinated with the same transformed strain or different strains, identifying the resulting hybrid having
the desired phenotypic characteristic. Two or more generations may be grown to ensure that the subject
phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure the desired
phenotype or other property has been achieved.

As a host cell, any plant variety may be employed which provides a seed of interest. Thus, for the most part,
plants will be chosen where the seed is produced in high amounts or a seed specific product of interest is
involved. Seeds of interest include the oil seeds, such as the Brassica seeds, cotton seeds, soybean,
safflower, sunflower, or the like; grain seeds, e.g. wheat, barley, rice, clover, corn, or the like.

Identifying useful transcriptional initiation regions may be achieved in a number of ways. Where the seed
protein has been or is isolated, it may be partially sequenced, so that a probe may be designed for identifying
messenger RNA specific for seed. To further enhance the concentration of the messenger RNA specifically
associated with seed, cDNA may be prepared and the cDNA subtracted with messenger RNA or cDNA from
non-seed associated cells. The residual cDNA may then be used for probing the genome for complementary
sequences, using an appropriate library prepared from plant cells. Sequences which hybridize to the cDNA
may then be isolated, manipulated, and the 5'-untranslated region associated with the coding region isolated
and used in expression constructs to identify the transcriptional activity of the 5'-untranslated region.

In some instances, the research effort may be further shortened by employing a probe directly for screening
a genomic library and identifying sequences which hybridize to the probe. The sequences will be manipulated
as described above to identify the 5'-untranslated region.

The expression constructs which are prepared employing the 5'-untranslated regions may be transformed
into plant cells as described previously for determination of their ability to function with a heterologous
structural gene (other than the wild-type open reading frame associated with the 5'-untranslated region) and
the seed specificity. In this manner, specific sequences may be identified for use with sequences for seed
specific transcription. Expression cassettes of particular interest include transcriptional initiation regions from
napin genes, particularly Brassica napin genes, more particularly Brassica napus or Brassica campestris
genes, regulating structural genes associated with lipid production, particularly fatty acid production, including
acyl carrier proteins, which may be endogenous or exogenous to the particular plant, such as spinach acyl
carrier protein, Brassica acyl carrier protein, either napus or campestris, Cuphea acyl
carrier protein, acetyl transacylase, malonyl transacylase, β-ketoacyl synthases I and II, thioesterase,
particularly thioesterase II, from plant, mammalian, or bacterial sources, for example rat thioesterase II, acyl
ACP, or phospholipid acyl desaturases.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL

Materials and Methods

Cloning Vectors 

Cloning vectors used include the pUC vectors, pUC8 and pUC9 (Vieira and Messing, Gene (1982)
19:259-268); pUC18 and pUC19 (Norrander et al., Gene (1983) 26:101-106; Yanisch-Perron et al., Gene (1985)
33:103-119), and analogous vectors exchanging chloramphenicol resistance (CAM) as a marker for the
ampicillin resistance of the pUC plasmids described above (pUC-CAM [pUC12-Cm, pUC13-Cm] Buckley, P.,
Ph.D. Thesis, U.C.S.D., CA 1985). The multiple cloning sites of pUC18 and pUC19 vectors were exchanged with
those of pUC-CAM to create pCGN565 and pCGN566 which are CAM resistant. Also used were pUC118 and
pUC119, which are respectively, pUC18 and pUC19 with the intergenic region of M13, from an HgiAI site at
5465 to the AhaIII site at 5941, inserted at the NdeI site of pUC. (Available from Vieira, J. and Messing, J.,
Waksman Institute, Rutgers University, Rutgers, N.J.)

Materials

Terminal deoxynucleotide transferase (TDT), RNaseH, E. coli DNA polymerase, T4 kinase, and restriction
enzymes were obtained from Bethesda Research Laboratories; E. coli DNA ligase was obtained from New
England Biolabs; reverse transcriptase was obtained from Life Sciences, Inc.; isotopes were obtained from
Amersham; X-gal was obtained from Bachem, Inc., Torrance, CA.

Example 1 

Construction of a Napin Promoter

There are 298 nucleotides upstream of the ATG start codon of the napin gene on the pgN1 clone, a 3.3 kb
EcoRI fragment of B. napus genomic DNA containing a napin gene cloned into pUC8 (available from Marti
Crouch, University of Indiana). pgN1 DNA was digested with EcoRI and SstI and ligated to EcoRI/SstI digested
pCGN706. (pCGN706 is an XhoI/PstI fragment containing 3' and polyadenylation sequences of another napin
cDNA clone pN2 (Crouch et al., 1983 supra) cloned in pCGN566 at the SalI and PstI sites.) The resulting clone
pCGN707 was digested with SalI and treated with the enzyme Bal31 to remove some of the coding region of
the napin gene. The resulting resected DNA was digested with SmaI after the Bal31 treatment and religated.
One of the clones, pCGN713, selected by size, was subcloned by EcoRI and BamHI digestion into both
EcoRI/BamHI digested pEMBL18 (Dente et al., Nucleic Acids Res. (1983) 11:1645-1655) and pUC118 to give
E418 and E4118 respectively. The extent of Bal31 digestion was confirmed by Sanger dideoxy sequencing of
E418 template. The Bal31 deletion of the promoter region extended only to 57 nucleotides downstream of the
start codon, thus containing the 5' end of the napin coding sequence and about 300 bp of the 5' non-coding
region. E4118 was tailored to delete all of the coding region of napin including the ATG start codon by in vitro
mutagenesis by the method of Zoller and Smith (Nucleic Acids Res. (1982) 10:6487-6500) using an
oligonucleotide primer 5'-GATGTrrTGTATGTGG(lcCCCTAGGAGATC-3'. Screening for the appropriate
mutant was done by two transformations into E. coli strain JM83 (Messing, J., In: Recombinant DNA Technical
Bulletin, NIH Publication No. 79-99, 2 No. 2, 1979, pp 43-48) and SmaI digestion of putative transformants. The
resulting napin promoter clone is pCGN778 and contains 298 nucleotides from the EcoRI site of pgN1 to the A
nucleotide just before the ATG start codon of napin. The promoter region was subcloned into a
chloramphenicol resistant background by digestion with EcoRI and BamHI and ligation to EcoRI/BamHI
digested pCGN565 to give pCGN779c.

Extension of the Napin Promoter Clone 

pCGN779c contains only 298 nucleotides of potential 5'-regulatory sequence. The napin promoter was
extended with a 1.8 kb fragment found upstream of the 5'-EcoRI site on the original λBnNa clone. The ~3.5 kb
XhoI fragment of λBnNa (available from M. Crouch), which includes the napin region, was subcloned into
SalI-digested pUC119 to give pCGN930. A HindIII site close to a 5' XhoI site was used to subclone the
HindIII/EcoRI fragment of pCGN930 into HindIII/EcoRI digested Bluescript + (Vector Cloning Systems, San
Diego, CA) to give pCGN942. An extended napin promoter was made by ligating pCGN779c digested with
EcoRI and PstI and pCGN942 digested with EcoRI and PstI to make pCGN943. This promoter contains ~2.1
kb of sequence upstream of the original ATG of the napin gene contained on λBnNa. A partial sequence of the
promoter region is shown in Figure 1.
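
The quoted promoter length follows from adding the upstream fragment to the original 298 nucleotides. A small worked check, treating the "1.8 kb" figure as exactly 1800 nucleotides (an assumption, since the text gives it only to two significant figures):

```python
ORIGINAL_PROMOTER_NT = 298       # promoter carried by pCGN779c
UPSTREAM_EXTENSION_NT = 1800     # "1.8 kb" fragment, taken as 1800 nt (assumed)

extended_nt = ORIGINAL_PROMOTER_NT + UPSTREAM_EXTENSION_NT
extended_kb = round(extended_nt / 1000, 1)   # expressed to one decimal place
```

The sum, 2098 nucleotides, rounds to the 2.1 kb stated for the extended promoter in pCGN943.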

Napin Cassettes 

The extended napin promoter and a napin 3'-regulatory region are combined to make a napin cassette for
expressing genes seed-specifically. The napin 3'-region used is from the plasmid pCGN1924 containing the
XhoI/EcoRI fragment from pgN1 (XhoI site is located 18 nucleotides from the stop codon of the napin gene)
subcloned into EcoRI/SalI digested pCGN565. HindIII/PstI digested pCGN943 and pCGN1924 are ligated to
make the napin cassette pCGN944, with unique cloning sites SmaI, SalI, and PstI for inserting genes.

Construction of cDNA Library from Spinach Leaves

Total RNA was extracted from young spinach leaves in 4 M guanidine thiocyanate buffer as described by
Facciotti et al. (Biotechnology (1985) 3:241-246). Total RNA was subjected to oligo(dT)-cellulose column
chromatography two times to yield poly(A)+ RNA as described by Maniatis et al., (1982) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, New York. A cDNA library was constructed in pUC13-Cm
according to the method of Gubler and Hoffman (Gene (1983) 25:263-269) with slight modifications. RNasin
was omitted in the synthesis of first strand cDNA as it interfered with second strand synthesis if not completely
removed, and dCTP was used to tail the vector DNA and dGTP to tail double-stranded cDNA instead of the
reverse as described in the paper. The annealed cDNA was transformed to competent E. coli JM83 (Messing
(1979) supra) cells according to Hanahan (J. Mol. Biol. (1983) 166:557-580) and spread onto LB agar plates
(Miller (1972) Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New
York) containing 50 µg/ml chloramphenicol and 0.005% X-Gal.

Identification of Spinach ACP-I cDNA

A total of approximately 8000 cDNA clones were screened by performing Southern blots (Southern, J. Mol.
Biol. (1975) 98:503) and dot blot (described below) hybridizations with clone analysis DNA from 40 pools
representing 200 cDNA clones each (see below). A 5' end labeled synthetic oligonucleotide (ACPP4) that is at
least 66% homologous with a 16 amino acid region of spinach ACP-I
(5'-GATGTCTTGAGCCTTGTCCTCATCCACATTGATACCAAACTCCTCCTC-3') is the complement to a DNA sequence
that could encode the 16 amino acid peptide
glu-glu-glu-phe-gly-ile-asn-val-asp-glu-asp-lys-ala-gln-asp-ile, residues 49-64 of spinach ACP-I
(Kuo and Ohlrogge, Arch. Biochem. Biophys. (1984) 234:290-296) and was used for an ACP probe.
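
The stated relationship between the ACPP4 probe and the 16 amino acid peptide can be checked by translating the probe's reverse complement. The sketch below assumes the standard genetic code and reads the one character of the printed probe that is not a nucleotide letter as T; the helper names are hypothetical.

```python
# Standard genetic code, laid out in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# ACPP4 as given in the text (48 nucleotides, written 5' to 3').
ACPP4 = "GATGTCTTGAGCCTTGTCCTCATCCACATTGATACCAAACTCCTCCTC"


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand written 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate in frame 1, one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


coding_strand = reverse_complement(ACPP4)
peptide = translate(coding_strand)
```

Here `peptide` comes out as EEEFGINVDEDKAQDI, which is the one-letter form of glu-glu-glu-phe-gly-ile-asn-val-asp-glu-asp-lys-ala-gln-asp-ile, so the probe is indeed the complement of one possible coding sequence for residues 49-64.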

Clone analysis DNA for Southern and dot blot hybridizations was prepared as follows. Transformants were
transferred from agar plates to LB containing 50 µg/ml chloramphenicol in groups of ten clones per 10 ml
media. Cultures were incubated overnight in a 37°C shaking incubator and then diluted with an equal volume of
media and allowed to grow for 5 more hours. Pools of 200 cDNA clones each were obtained by mixing
contents of 20 samples. DNA was extracted from these cells as described by Birnboim and Doly (Nucleic
Acids Res. (1979) 7:1513-1523). DNA was purified to enable digestion with restriction enzymes by extractions
with phenol and chloroform followed by ethanol precipitation. DNA was resuspended in sterile, distilled water
and 1 µg of each of the 40 pooled DNA samples was digested with EcoRI and HindIII and electrophoresed
through 0.7% agarose gels. DNA was transferred to nitrocellulose filters following the blot hybridization
technique of Southern.
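
The pooling scheme is simple bookkeeping, and the figures given are mutually consistent. A short check, with variable names chosen here for illustration:

```python
CLONES_PER_CULTURE = 10    # ten clones grown together per 10 ml culture
CULTURES_PER_POOL = 20     # contents of 20 cultures mixed per pool
NUMBER_OF_POOLS = 40       # pooled DNA samples run on the gels

clones_per_pool = CLONES_PER_CULTURE * CULTURES_PER_POOL
total_clones_screened = clones_per_pool * NUMBER_OF_POOLS
```

This gives 200 clones per pool and 8000 clones in total, matching the approximately 8000 cDNA clones reported as screened.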

ACPP4 was 5' end-labeled using γ-32P dATP and T4 kinase according to the manufacturer's specifications.
Nitrocellulose filters from Southern blot transfer of clone analysis DNA were hybridized (24 hours, 42°C) and
washed according to Berent et al. (BioTechniques (1985) 3:208-220). Dot blots of the same set of DNA pools
were prepared by applying 1 µg of each DNA pool to nylon membrane filters in 0.5 M NaOH. These blots were
hybridized with the probe for 24 hours at 42°C in 50% formamide/1% SDS/1 M NaCl, and washed at room
temperature in 2X SSC/0.1% SDS (1X SSC = 0.15 M NaCl; 0.015 M Na citrate; SDS = sodium dodecylsulfate).
DNA from the pool which was hybridized by the ACPP4 oligoprobe was transformed to JM83 cells and plated
as above to yield individual transformants. Dot blots of these individual cDNA clones were prepared by
applying DNA to nitrocellulose filters which were hybridized with the ACPP4 oligonucleotide probe and
analyzed using the same conditions as for the Southern blots of pooled DNA samples.
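
Given the definition of 1X SSC above, the composition of the 2X wash follows by scaling. A minimal sketch (the dictionary keys and function name are hypothetical):

```python
# 1X SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X = {"NaCl_M": 0.15, "Na_citrate_M": 0.015}


def ssc(fold: float) -> dict:
    """Molar concentrations of an n-fold SSC solution."""
    return {name: round(molar * fold, 4) for name, molar in SSC_1X.items()}


wash = ssc(2)   # the 2X SSC used for the room-temperature wash
```

The 2X wash therefore contains 0.30 M NaCl and 0.030 M sodium citrate, in addition to the 0.1% SDS.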

Nucleotide Sequence Analysis 

The positive clone, pCGN1SOL, was analyzed by digestion with restriction enzymes and the following partial
map was obtained.

[Partial restriction map of the cDNA insert in pUC13-Cm; fragment sizes in bp, in order: 35, 248, 63, 152, ~200.
The flanking polylinker is shown with available restriction sites indicated.]

The cDNA clone was subcloned into pUC118 and pUC119 using standard laboratory techniques of
restriction, ligation, transformation, and analysis (Maniatis et al., (1982) supra). Single-stranded DNA template
was prepared and DNA sequence was determined using the Sanger dideoxy technique (Sanger et al., (1977)
Proc. Nat. Acad. Sci. USA 74:5463-5467). Sequence analysis was performed using a software package from
IntelliGenetics, Inc.

pCGN1SOL contains an (approximately) 700 bp cDNA insert including a stretch of A residues at the 3'
terminus which represents the poly(A) tail of the mRNA. An ATG codon at position 61 is presumed to encode
the MET translation initiation codon. This codon is the start of a 411 nucleotide open reading frame, of which
nucleotides 229-471 encode a protein whose amino acid sequence corresponds almost perfectly with the
published amino acid sequence of ACP-I of Kuo and Ohlrogge supra as described previously. In addition to mature
protein, the pCGN1SOL also encodes a 56 residue transit peptide sequence, as might be expected for a
nuclear-encoded chloroplast protein.
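
The coordinates given for the open reading frame are internally consistent, as the following arithmetic sketch shows. Positions are taken as 1-based and inclusive, which is an assumption about the numbering convention; the variable names are illustrative.

```python
ORF_START = 61                      # position of the presumed initiating ATG
ORF_LENGTH_NT = 411                 # stated length of the open reading frame
MATURE_START, MATURE_END = 229, 471 # nucleotides encoding the mature protein

orf_end = ORF_START + ORF_LENGTH_NT - 1       # last nucleotide of the ORF
transit_nt = MATURE_START - ORF_START         # nucleotides preceding the mature region
transit_residues = transit_nt // 3            # codons in the transit peptide
mature_nt = MATURE_END - MATURE_START + 1     # nucleotides in the mature region
```

The reading frame ends at nucleotide 471, the 168 nucleotides from 61 through 228 give exactly the 56 residue transit peptide, and the remaining 243 nucleotides (81 codons, in frame) account for the rest of the 411 nucleotide total.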


Napin - ACP Construct 

pCGN796 was constructed by ligating pCGN1SOL digested with HindIII/BamHI, pUC8 digested with HindIII
and BamHI and pUC118 digested with BamHI. The ACP gene from pCGN796 was transferred into a
chloramphenicol background by digestion with BamHI and ligation with BamHI digested pCGN565. The
resulting pCGN1902 was digested with EcoRI and SmaI and ligated to EcoRI/SmaI digested pUC118 to give
pCGN1920. The ACP gene in pCGN1920 was digested at the NcoI site, filled in by treatment with the Klenow
fragment, digested with SmaI and religated to form pCGN1919. This eliminated the 5'-coding sequences from
the ACP gene and regenerated the ATG. This ACP gene was flanked with PstI sites by digesting pCGN1919
with EcoRI, filling in the site with the Klenow fragment and ligating a PstI linker. This clone is called pCGN945.

The ACP gene of pCGN945 was moved as a BamHI/PstI fragment to pUC118 digested with BamHI and PstI
to create pCGN946a so that a SmaI site (provided by the pUC118) would be at the 5'-end of the ACP
sequences to facilitate cloning into the napin cassette pCGN944. pCGN946a digested with SmaI and PstI was
ligated to pCGN944 digested with SmaI and PstI to produce the napin ACP cassette pCGN946. The napin ACP
cassette was then transferred into the binary vector pCGN783 by cloning from the HindIII site to produce
pCGN948.

Construction of the Binary Vector pCGN783 

pCGN783 is a binary plasmid containing the left and right T-DNA borders of A. tumefaciens (Barker et al.,
Plant Mol. Biol. (1983) 2:335-350); the gentamicin resistance gene of pPH1JI (Hirsch et al., Plasmid (1984),
12:139-141); the 35S promoter of cauliflower mosaic virus (CaMV) (Gardner et al., Nucleic Acids Res. (1981)
9:2871-2890); the kanamycin resistance gene of Tn5 (Jorgenson et al., infra and Wolff et al., ibid (1985)
13:355-367) and the 3' region from transcript 7 of pTiA6 (Barker et al., supra (1983)).

To obtain the gentamicin resistance marker, the gentamicin resistance gene was isolated from a 3.1 kb
EcoRI-PstI fragment of pPH1JI and cloned into pUC9, yielding pCGN649. The HindIII-BamHI fragment
containing the gentamicin resistance gene was substituted for the HindIII-BglII fragment of pCGN587, creating
pCGN594.

pCGN587 was prepared as follows: The HindIII-SmaI fragment of Tn5 containing the entire structural gene
for APHII (Jorgenson et al., Mol. Gen. Genet. (1979) 177:65) was cloned into pUC8 (Vieira and Messing, Gene
(1982) 19:259), converting the fragment into a HindIII-EcoRI fragment, since there is an EcoRI site immediately
adjacent to the SmaI site. The PstI-EcoRI fragment containing the 3'-portion of the APHII gene was then
combined with an EcoRI-BamHI-SalI-PstI linker into the EcoRI site of pUC7 (pCGN546W). Since this construct
does not confer kanamycin resistance, kanamycin resistance was obtained by inserting the BglII-PstI fragment
of the APHII gene into the BamHI-PstI site (pCGN546X). This procedure reassembles the APHII gene, so that
EcoRI sites flank the gene. An ATG codon was upstream from and out of reading frame with the ATG initiation
codon of APHII. The undesired ATG was avoided by inserting a Sau3A-PstI fragment from the 5'-end of APHII,
which fragment lacks the superfluous ATG, into the BamHI-PstI site of pCGN546W to provide plasmid
pCGN560.
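
The reading-frame point made above (an upstream ATG that is out of frame with the initiation codon) can be checked mechanically. The following is a minimal sketch and not the procedure used in this work: the sequence is an invented toy leader, not Tn5 or APHII sequence, and `upstream_atgs` is a hypothetical helper name.

```python
def upstream_atgs(seq, start_index):
    """List each ATG upstream of the initiation codon as (position, in_frame).

    seq         -- DNA string written 5' to 3'
    start_index -- 0-based index of the A of the intended initiation ATG
    """
    seq = seq.upper()
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index does not point at an ATG")
    hits = []
    pos = seq.find("ATG")
    while pos != -1 and pos < start_index:
        # In frame only if the distance to the start codon is a multiple of 3.
        hits.append((pos, (start_index - pos) % 3 == 0))
        pos = seq.find("ATG", pos + 1)
    return hits


# Toy leader, hypothetical and for illustration only.
toy = "GGATGCCTTAGGCATGGCT"
start = toy.rfind("ATG")            # treat the last ATG as the initiation codon
result = upstream_atgs(toy, start)  # one upstream ATG, not in frame
```

An upstream ATG reported with `in_frame` set to `False` corresponds to the situation described for the original APHII construct; removing the 5' segment that carries it, as done with the Sau3A-PstI fragment, leaves an empty list.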

The EcoRI fragment containing the APHII gene was then cloned into the unique EcoRI site of pCGN451,
which contains an octopine synthase cassette for expression, to provide pCGN552 (1ATG).

pCGN451 includes an octopine cassette which contains about 1656 bp of the 5' non-coding region fused via
an EcoRI linker to the 3' non-coding region of the octopine synthase gene of pTiA6. The pTi coordinates are
11,207 to 12,823 for the 3' region and 13,643 to 15,208 for the 5' region as defined by Barker et al., Plant Mol.
Biol. (1983) 2:325.

The 5' fragment was obtained as follows. A small subcloned fragment containing the 5' end of the coding
region, as a BamHI-EcoRI fragment, was cloned in pBR322 as plasmid pCGN407. The BamHI-EcoRI fragment
has an XmnI site in the coding region, while pBR322 has two XmnI sites. pCGN407 was digested with XmnI,
resected with Bal31 nuclease and EcoRI linkers added to the fragments. After EcoRI and BamHI digestion, the
fragments were size fractionated, the fractions cloned and sequenced. In one case, the entire coding region
and 10 bp of the 5' non-translated sequences had been removed leaving the 5' non-translated region, the
mRNA cap site and 16 bp of the 5' non-translated region (to a BamHI site) intact. This small fragment was
obtained by size fractionation on a 7% acrylamide gel and fragments approximately 130 bp long eluted.

This size-fractionated DNA was ligated into M13mp9 and several clones sequenced and the sequence
compared to the known sequence of the octopine synthase gene. The M13 construct was designated p14,
which plasmid was digested with BamHI and EcoRI to provide the small fragment which was ligated to a XhoI
to BamHI fragment containing upstream 5' sequences from pTiA6 (Garfinkel and Nester, J. Bacteriol. (1980)
144:732) and to an EcoRI to XhoI fragment containing the 3' sequences.

The resulting XhoI fragment was cloned into the XhoI site of a pUC8 derivative, designated pCGN426. This
plasmid differs from pUC8 by having the sole EcoRI site filled in with DNA polymerase I, and having lost the PstI
and HindIII site by nuclease contamination of HincII restriction endonuclease, when a XhoI linker was inserted
into the unique HincII site of pUC8. The resulting plasmid pCGN451 has a single EcoRI site for the insertion of
protein coding sequences between the 5' non-coding region (which contains 1,550 bp of 5' non-transcribed
sequence including the right border of the T-DNA, the mRNA cap site and 16 bp of 5' non-translated
sequence) and the 3' region (which contains 257 bp of the coding region, the stop codon, 196 bp of 3'
non-translated DNA, the polyA site and 1,153 bp of 3' non-transcribed sequence). pCGN451 also provides the
right T-DNA border.

The resulting plasmid pCGN451 having the ocs 5' and the ocs 3' in the proper orientation was digested with
EcoRI and the EcoRI fragment from pCGN551 containing the intact kanamycin resistance gene inserted into
the EcoRI site to provide pCGN552 having the kanamycin resistance gene in the proper orientation.

This ocs/KAN gene was used to provide a selectable marker for the trans type binary vector pCGN587.

The 5' portion of the engineered octopine synthase promoter cassette consists of pTiA6 DNA from the XhoI
at bp 15208-13644 (Barker's numbering), which also contains the T-DNA boundary sequence (border)
implicated in T-DNA transfer. In the plasmid pCGN587, the ocs/KAN gene from pCGN552 provides a
selectable marker as well as the right border. The left boundary region was first cloned in M13mp9 as a
HindIII-SmaI piece (pCGN502) (base pairs 602-2213) and recloned as a KpnI-EcoRI fragment in pCGN565 to
provide pCGN580. pCGN565 is a cloning vector based on pUC8-Cm, but containing pUC18 linkers. pCGN580
was linearized with BamHI and used to replace the smaller BglII fragment of pVCK102 (Knauf and Nester,
Plasmid (1982) 8:45), creating pCGN585. By replacing the smaller SalI fragment of pCGN585 with the XhoI
fragment from pCGN552 containing the ocs/KAN gene, pCGN587 was obtained.

The pCGN594 HindIII-BamHI region, which contains a 5'-ocs-kanamycin-ocs-3' fragment (ocs is octopine
synthase, with 5' designating the promoter region and 3' the terminator region; see U.S. application serial
no. 775,923, filed September 13, 1985), was replaced with the HindIII-BamHI polylinker region from pUC18.

pCGN566 contains the EcoRI-HindIII linker of pUC18 inserted into the EcoRI-HindIII sites of pUC13-Cm. The
HindIII-BglII fragment of pNW31G-8.29-1 (Thomashow et al., Cell (1980) 19:729) containing ORF1 and -2 of
pTiA6 was subcloned into the HindIII-BamHI sites of pCGN566 producing pCGN703.

The Sau3A fragment of pCGN703 containing the 3' region of transcript 7 (corresponding to bases 2396-2920
of pTiA6 (Barker et al., (1983) supra)) was subcloned into the BamHI site of pUC18 producing pCGN709. The
EcoRI-SmaI polylinker region of pCGN709 was substituted with the EcoRI-SmaI fragment of pCGN587, which
contains the kanamycin resistance gene (APH3-II), producing pCGN726.

The EcoRI-SalI fragment of pCGN726 plus the BglII-EcoRI fragment of pCGN734 were inserted into the
BamHI-SalI site of pUC8-Cm producing pCGN738. pCGN726c is derived from pCGN738 by deleting the 900 bp
EcoRI-EcoRI fragment.

To construct pCGN167, the AluI fragment of CaMV (bp 7144-7735) (Gardner et al., Nucl. Acids Res. (1981)
9:2871-2888) was obtained by digestion with AluI and cloned into the HincII site of M13mp7 (Messing et al.,
Nucl. Acids Res. (1981) 9:309-321) to create C614. An EcoRI digest of C614 produced the EcoRI fragment
from C614 containing the 35S promoter which was cloned into the EcoRI site of pUC8 (Vieira and Messing,
Gene (1982) 19:259) to produce pCGN146.

To trim the promoter region, the BglII site (bp 7670) was treated with BglII and resected with Bal31 and
subsequently a BglII linker was attached to the Bal31-treated DNA to produce pCGN147.

pCGN148a, containing a promoter region, selectable marker (KAN with 2 ATG's) and 3' region, was prepared
by digesting pCGN528 with BglII and inserting the BamHI-BglII promoter fragment from pCGN147. This
fragment was cloned into the BglII site of pCGN528 so that the BglII site was proximal to the kanamycin gene of
pCGN528.

The shuttle vector used for this construct, pCGN528, was made as follows. pCGN525 was made by
digesting a plasmid containing Tn5 which harbors a kanamycin gene (Jorgenson et al., Mol. Gen. Genet. (1979)
177:65) with HindIII-BamHI and inserting the HindIII-BamHI fragment containing the kanamycin gene into the
HindIII-BamHI sites in the tetracycline gene of pACYC184 (Chang and Cohen, J. Bacteriol. (1978)
134:1141-1156). pCGN526 was made by inserting the BamHI fragment 19 of pTiA6 (Thomashow et al., Cell
(1980) 19:729-739), modified with XhoI linkers inserted into the SmaI site, into the BamHI site of pCGN525.
pCGN528 was obtained by deleting the small XhoI fragment from pCGN526 by digesting with XhoI and
religating.

pCGN149a was made by cloning the BamHI-kanamycin gene fragment from pMB9KanXXI into the BamHI
site of pCGN148a.

pMB9KanXXI is a pUC4K variant (Vieira and Messing, Gene (1982) 19:259-268) which has the XhoI site
missing but contains a functional kanamycin gene from Tn903 to allow for efficient selection in Agrobacterium.

pCGN149a was digested with BglII and SphI. This small BglII-SphI fragment of pCGN149a was replaced with
the BamHI-SphI fragment from M1 (see below) isolated by digestion with BamHI and SphI. This produces
pCGN167, a construct containing a full length CaMV promoter, 1ATG-kanamycin gene, 3' end and the bacterial
Tn903-type kanamycin gene. M1 is an EcoRI fragment from pCGN546X (see construction of pCGN587) and
was cloned into the EcoRI cloning site of M13mp9 in such a way that the PstI site in the 1ATG-kanamycin gene
was proximal to the polylinker region of M13mp9.

The HindIII-BamHI fragment in the pCGN167 containing the CaMV-35S promoter, 1ATG-kanamycin gene
and the BamHI-fragment 19 of pTiA6 was cloned into the BamHI-HindIII sites of pUC19 creating pCGN976. The
35S promoter and 3' region from transcript 7 was developed by inserting a 0.7 kb HindIII-EcoRI fragment of
pCGN976 (35S promoter) and the 0.5 kb EcoRI-SalI fragment of pCGN709 (transcript 7:3') into the HindIII-SalI
sites of pCGN566 creating pCGN766c.

The 0.7 kb HindIII-EcoRI fragment of pCGN766c (CaMV-35S promoter) was ligated to the 1.5 kb EcoRI-SalI
fragment in pCGN726c (1ATG-KAN 3' region) followed by insertion into the HindIII-SalI sites of pUC119 to
produce pCGN778. The 2.2 kb region of pCGN778, HindIII-SalI fragment containing the CaMV-35S promoter
and 1ATG-KAN-3' region, was used to replace the HindIII-SalI linker region of pCGN739 to produce pCGN783.
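
Because pCGN783 is assembled through many intermediate plasmids, it can help to record the parentage as data and query it. The sketch below is only a bookkeeping aid under stated assumptions: plasmid names are as read from this section (several are garbled in the printed copy and are read here as pCGN976, pCGN739 and so on), fragment and enzyme details are omitted, and `PARENTS` and `ancestors` are hypothetical names.

```python
# Direct parents of each construct, as described in this section.
# Names are as read from the text; fragment details are left out.
PARENTS = {
    "pCGN783": ["pCGN778", "pCGN739"],
    "pCGN778": ["pCGN766c", "pCGN726c", "pUC119"],
    "pCGN766c": ["pCGN976", "pCGN709", "pCGN566"],
    "pCGN976": ["pCGN167", "pUC19"],
    "pCGN726c": ["pCGN738"],
    "pCGN738": ["pCGN726", "pCGN734", "pUC8-Cm"],
    "pCGN726": ["pCGN709", "pCGN587"],
    "pCGN709": ["pCGN703", "pUC18"],
}


def ancestors(plasmid, parents=PARENTS):
    """Return every plasmid that contributes to `plasmid`, breadth first."""
    seen = []
    queue = list(parents.get(plasmid, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(parents.get(current, []))
    return seen


lineage = ancestors("pCGN783")

# Fragment sizes quoted in the text for pCGN778: 0.7 kb promoter + 1.5 kb KAN-3'.
insert_kb = round(0.7 + 1.5, 1)
```

The summed fragment sizes agree with the 2.2 kb HindIII-SalI region quoted for pCGN778.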


Transfer of the Binary Vector pCGN948 Into Agrobacterium

pCGN948 was introduced into Agrobacterium tumefaciens EHA101 (Hood et al., J. Bacteriol. (1986)
168:1291-1301) by transformation. An overnight 2 ml culture of EHA101 was grown in MG/L broth at 30°C. 0.5
ml was inoculated into 100 ml of MG/L broth (Garfinkel and Nester, J. Bacteriol. (1980) 144:732-743) and grown
in a shaking incubator for 5 h at 30°C. The cells were pelleted by centrifugation, resuspended in 1 ml of
MG/L broth and placed on ice. Approximately 1 µg of pCGN948 DNA was placed in 100 µl of MG/L broth to
which 200 µl of the EHA101 suspension was added; the tube containing the DNA-cell mix was immediately
placed into a dry ice/ethanol bath for 5 minutes. The tube was quick thawed by 5 minutes in a 37°C water bath
followed by 2 h of shaking at 30°C after adding 1 ml of fresh MG/L medium. The cells were pelleted and spread
onto MG/L plates (1.5% agar) containing 100 mg/l gentamicin. Plasmid DNA was isolated from individual
gentamicin-resistant colonies, transformed back into E. coli, and characterized by restriction enzyme analysis
to verify that the gentamicin-resistant EHA101 contained intact copies of pCGN948. Single colonies are picked
and purified by two more streakings on MG/L plates containing 100 mg/l gentamicin.

Transformation and Regeneration of B. napus

Seeds of Brassica napus cv Westar were soaked in 95% ethanol for 4 minutes. They were sterilized in 1%
solution of sodium hypochlorite with 50 µl of "Tween 20" surfactant per 100 ml sterilant solution. After soaking
for 45 minutes, seeds were rinsed 4 times with sterile distilled water. They were planted in sterile plastic
boxes 7 cm wide, 7 cm long, and 10 cm high (Magenta) containing 50 ml of 1/10th concentration of MS
(Murashige minimal organics medium, Gibco) with added pyridoxine (500 µg/l), nicotinic acid (50 µg/l), glycine
(200 µg/l) and solidified with 0.6% agar. The seeds germinated and were grown at 22°C in a 16h-8h light-dark
cycle with light intensity approximately 65 µE m⁻² s⁻¹. After 5 days the seedlings were taken under sterile
conditions and the hypocotyls excised and cut into pieces of about 4 mm in length. The hypocotyl segments
were placed on a feeder plate or without the feeder layer on top of a filter paper on the solidified B5 0/1/1 or
B5 0/1/0 medium. B5 0/1/0 medium contains B5 salts and vitamins (Gamborg, Miller and Ojima, Experimental
Cell Res. (1968) 50:151-158), 3% sucrose, 2,4-dichlorophenoxyacetic acid (1.0 mg/l), pH adjusted to 5.8, and
the medium is solidified with 0.6% Phytagar; B5 0/1/1 is the same with the addition of 1.0 mg/l kinetin. Feeder
plates were prepared 24 hours in advance by pipetting 1.0 ml of a stationary phase tobacco suspension culture
(maintained as described in Fillatti et al., Molecular General Genetics (1987) 206:192-199) onto B5 0/1/0 or
B5 0/1/1 medium. Hypocotyl segments were cut and placed on feeder plates 24 hours prior to Agrobacterium
treatment.

Agrobacterium tumefaciens (strain EHA101 x 948) was prepared by incubating a single colony of
Agrobacterium in MG/L broth at 30°C. Bacteria were harvested 16 hours later and dilutions of 10^ bacteria per
ml were prepared in MG/L broth. Hypocotyl segments were inoculated with bacteria by placing the segments
in an Agrobacterium suspension and allowing them to sit for 30-60 minutes, then removing and transferring to
Petri plates containing B5 0/1/1 or 0/1/0 medium (0/1/1 intends 1 mg/l 2,4-D and 1 mg/l kinetin and 0/1/0
intends no kinetin). The plates were incubated in low light at 22°C. The co-incubation of bacteria with the
hypocotyl segments took place for 24-48 hours. The hypocotyl segments were removed and placed on
B5 0/1/1 or 0/1/0 containing 500 mg/l carbenicillin (kanamycin sulfate at 10, 25, or 60 mg/l was sometimes
added at this time) for 7 days in continuous light (approximately 65 µE m⁻² s⁻¹) at 22°C. The segments were
transferred to B5 salts medium containing 1% sucrose, 3 mg/l benzylaminopurine (BAP) and 1 mg/l zeatin.
This was supplemented with 500 mg/l carbenicillin, 10, 25, or 60 mg/l kanamycin sulfate, and solidified with
0.6% Phytagar (Gibco). Thereafter, explants were transferred to fresh medium every two weeks.
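
The "B5 x/y/z" shorthand used above packs hormone concentrations into a three-field code. As a rough illustration only, the sketch below decodes such a code using the two meanings the text gives (second field 2,4-D in mg/l, third field kinetin in mg/l). The meaning of the first field is not stated in this section, so it is kept under a neutral name; `parse_b5_code` is a hypothetical helper.

```python
def parse_b5_code(code):
    """Split an 'x/y/z' medium code into numeric fields.

    Per the text, the second field is 2,4-D (mg/l) and the third is kinetin
    (mg/l). The first field is not explained in this section and is returned
    under a neutral key.
    """
    parts = code.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three '/'-separated fields, got {code!r}")
    first, two_four_d, kinetin = (float(p) for p in parts)
    return {
        "unspecified_first_field": first,
        "2,4-D_mg_per_l": two_four_d,
        "kinetin_mg_per_l": kinetin,
    }


with_kinetin = parse_b5_code("0/1/1")
without_kinetin = parse_b5_code("0/1/0")
```

Keeping the first field unnamed avoids guessing at a hormone the section never identifies.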

After one month green shoots developed from green calli which were selected on media containing
kanamycin. Shoots continued to develop for three months. The shoots were cut from the calli when they were
at least 1 cm high and placed on B5 medium with 1% sucrose, no added growth substances, 300 mg/l
carbenicillin, and solidified with 0.6% Phytagar. The shoots continued to grow and several leaves were
removed to test for neomycin phosphotransferase II (NPTII) activity. Shoots which were positive for NPTII
activity were placed in Magenta boxes containing B5 0/1/1 medium with 1% sucrose, 2 mg/l indolebutyric
acid, 200 mg/l carbenicillin, and solidified with 0.6% Phytagar. After a few weeks the shoots developed roots
and were transferred to soil. The plants were grown in a growth chamber at 22°C in a 16-8 hours light-dark
cycle with light intensity 220 µE m⁻² s⁻¹ and after several weeks were transferred to the greenhouse.

Southern Data

Regenerated B. napus plants from cocultivations of Agrobacterium tumefaciens EHA101 containing
pCGN948 and B. napus hypocotyls were examined for proper integration and embryo-specific expression of
the spinach leaf ACP gene. Southern analysis was performed using DNA isolated from leaves of regenerated
plants by the method of Dellaporta et al. (Plant Mol. Biol. Rep. (1983) 1:19-21) and purified once by banding in
CsCl. DNA (10 µg) was digested with the restriction enzyme EcoRI, electrophoresed on a 0.7% agarose gel
and blotted to nitrocellulose (see Maniatis et al., (1982) supra). Blots were probed with pCGN945 DNA
containing 1.8 kb of the spinach ACP sequence or with the EcoRI/HindIII fragment isolated from pCGN936c
(made by transferring the HindIII/EcoRI fragment of pCGN930 into pCGN566) containing the napin 5'
sequences labeled with 32P-dCTP by nick translation (described by the manufacturer, BRL Nick Translation
Reagent Kit, Bethesda Research Laboratories, Gaithersburg, MD). Blots were prehybridized and hybridized in
50% formamide, 10x Denhardt's, 5xSSC, 0.1% SDS, 5 mM EDTA, 100 µg/ml calf thymus DNA and 10% dextran
sulfate (hybridization only) at 42°C. (Reagents described in Maniatis et al., (1982) supra.) Washes were in
1xSSC, 0.1% SDS, 30 min and twice in 0.1xSSC, 0.1% SDS at 55°C.

Autoradiograms showed two bands of approximately 3.3 and 3.2 kb hybridized in the EcoRI digests of DNA
from four plants when probed with the ACP gene (pCGN945), indicating proper integration of the spinach leaf
ACP construct in the plant genome since 3.3 and 3.2 kb EcoRI fragments are present in the T-DNA region of
pCGN948. The gene construct was present in single or multiple loci in the different plants as judged by the
number of plant DNA-construct DNA border fragments detected when probed with the napin 5' sequences.

Northern Data 

Expression of the integrated spinach leaf ACP gene from the napin promoter was detected by Northern
analysis in seeds but not leaves of one of the transformed plants shown to contain the construct DNA.
Developing seeds were collected from the transformed plant 21 days post-anthesis. Embryos were dissected
from the seeds and frozen in liquid nitrogen. Total RNA was isolated from the seed embryos and from leaves of
the transformed plant by the method of Crouch et al. (Virology (1985) 140:281-288) and blotted to
nitrocellulose (Thomas, Proc. Natl. Acad. Sci. USA (1980) 77:5201-5205). Blots were prehybridized, hybridized,
and washed as described above. The probe was an isolated PstI/BamHI fragment from pCGN945 containing
only spinach leaf ACP sequences labeled by nick translation.

An RNA band of ~0.8 kb was detected in embryos but not leaves of the transformed plant, indicating
seed-specific expression of the spinach leaf ACP gene.

Example II 

Construction of B. campestris Napin Promoter Cassette

A BglII partial genomic library of B. campestris DNA was made in the lambda vector Charon 35 using
established protocols (Maniatis et al., 1982, supra). The titer of the amplified library was ~1.2 x 10^ phage/ml.
Four hundred thousand recombinant bacteriophage were plated at a density of 10^ per 9 x 9 in. NZY plate
(NZYM as described in Maniatis et al., 1982, supra) in NZY + 10 mM MgSO4 + 0.9% agarose after adsorption
to DH1 E. coli cells (Hanahan, J. Mol. Biol. (1983) 166:557) for 20 min at 37°C. Plates were incubated at 37°C for
-'IS hours, cooled at 4°C for 2.5 hours and the phage were lifted onto Gene Screen Plus (New England
Nuclear) by laying precut filters over the plates for approximately 1 min and peeling them off. The adsorbed
phage DNA was immobilized by floating the filter on 1.5 M NaCl, 0.5 M NaOH for 1 min, neutralizing in 1.5 M
NaCl, 0.5 M Tris-HCl, pH 8.0 for 2 min and 2xSSC for 3 min. Filters were air dried until just damp, prehybridized
and hybridized at 42°C as described for Southern analysis. Filters were probed for napin-containing clones
using an XhoI/SalI fragment of the cDNA clone BE5 which was isolated from the B. campestris seed cDNA
library described using the probe pN1 (Crouch et al., 1983, supra). Three plaques hybridized strongly on
duplicate filters and were plaque purified as described (Maniatis et al., 1982, supra).

One of the clones named lambda CGN1-2 was restriction mapped and the napin gene was localized to
overlapping 2.7 kb XhoI and 2.1 kb SalI restriction fragments. The two fragments were subcloned from lambda
CGN1-2 DNA into pCGN789 (a pUC based vector the same as pUC119 with the normal polylinker replaced by
the synthetic linker 5' GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT 3', which represents
the polylinker EcoRI, SalI, BglII, PstI, XhoI, BamHI, HindIII). The identity of the subclones as napin was
confirmed by sequencing. The entire coding region sequence as well as extensive 5' upstream and 3'
downstream sequences were determined (Figure 2). The lambda CGN1-2 napin gene is that encoding the
mRNA corresponding to the BE5 cDNA as determined by the exact match of their nucleotide sequences.
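
The synthetic linker quoted above can be checked against the list of restriction sites it is said to represent. The following sketch simply searches the linker for the standard recognition sequence of each enzyme; the variable names are illustrative, and the recognition sequences are the commonly documented ones rather than anything stated in this text.

```python
# Synthetic linker as printed in the text, 5' to 3'.
LINKER = "GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT"

# Standard recognition sequences, in the order the text lists the sites.
SITES = [
    ("EcoRI", "GAATTC"),
    ("SalI", "GTCGAC"),
    ("BglII", "AGATCT"),
    ("PstI", "CTGCAG"),
    ("XhoI", "CTCGAG"),
    ("BamHI", "GGATCC"),
    ("HindIII", "AAGCTT"),
]

# 0-based position of each site in the linker (-1 would mean "not found").
positions = [(name, LINKER.find(site)) for name, site in SITES]

# True if every site is present and the sites occur in the listed order.
offsets = [pos for _, pos in positions]
in_listed_order = all(p >= 0 for p in offsets) and offsets == sorted(offsets)
```

All seven sites are found, packed end to end from position 1, in the same order as the enzyme list.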

An expression cassette was constructed from the 5'-end and the 3'-end of the lambda CGN1-2 napin gene
as follows, in an analogous manner to the construction of pCGN944. The majority of the napin coding region of
pCGN940 was deleted by digestion with SalI and religation to form pCGN1800. Single-stranded DNA from
pCGN1800 was used in an in vitro mutagenesis reaction (Adelman et al., DNA (1983) 2:183-193) using the
synthetic oligonucleotide 5' GCTTGTTCGCCATGGATATCTTCTGTATGTTC 3'. This oligonucleotide inserted an
EcoRV and an NcoI restriction site at the junction of the promoter region and the ATG start codon of the napin
gene. An appropriate mutant was identified by hybridization to the oligonucleotide used for the mutagenesis
and sequence analysis and named pCGN1801.
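
The two sites introduced by the mutagenic oligonucleotide can be located directly in its sequence. The sketch below is illustrative only: it assumes the oligonucleotide is written complementary to the coding strand, so the reverse complement is inspected, and it uses the standard recognition sequences GATATC (EcoRV) and CCATGG (NcoI). Because both sites are palindromic, they are present in either orientation.

```python
# Mutagenic oligonucleotide as printed in the text, 5' to 3'.
OLIGO = "GCTTGTTCGCCATGGATATCTTCTGTATGTTC"


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Assumed coding-strand view of the mutated junction.
coding = reverse_complement(OLIGO)

ecorv_pos = coding.find("GATATC")   # EcoRV recognition sequence
ncoi_pos = coding.find("CCATGG")    # NcoI recognition sequence

# The NcoI site contains an ATG at its third base.
start_codon = coding[ncoi_pos + 2:ncoi_pos + 5]
```

In this view the EcoRV site sits immediately upstream of the NcoI site, and the ATG inside the NcoI site is consistent with the description of the sites being placed at the junction of the promoter and the start codon.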

A 1.7 kb promoter fragment was subcloned from pCGN1801 by partial digestion with EcoRV and ligation to
pCGN786 (a pCGN566 chloramphenicol based vector with the synthetic linker described above in place of the
normal polylinker) cut with EcoRI and blunted by filling in [...]. The resulting expression cassette, pCGN1803,
contains 1.725 kb of napin promoter sequence and 1.265 kb of napin 3' sequences with the unique cloning
sites SalI, BglII, PstI, and XhoI in between. Any sequence that requires seed-specific transcription or
expression in Brassica, i.e., a fatty acid gene, could be inserted in this cassette in a manner analogous to that
described for spinach leaf ACP and the B. napus napin cassette in Example I.

Example III

Other seed-specific promoters may be isolated from genes encoding proteins involved in seed
triacylglycerol synthesis, such as acyl carrier protein from Brassica seeds. Immature seed were collected from
Brassica campestris cv. "R-500," a self-compatible variety of turnip rape. Whole seeds were collected at
stages corresponding approximately to 14 to 28 days after flowering. RNA isolation and preparation of a cDNA
bank was as described above for the isolation of a spinach ACP cDNA clone except that the vector used was
pCGN565. To probe the cDNA bank, the oligonucleotide (5')-ACTTTCTCAACTGTCTCTGGTTTAGCAGC-(3')
was synthesized using an Applied Biosystems DNA Synthesizer, model 380A, according to manufacturer's
recommendations. This synthetic DNA molecule will hybridize at low stringencies to DNA or RNA sequences
coding for the amino acid sequence (ala-ala-lys-pro-glu-thr-val-glu-lys-val). This amino acid sequence has
been reported for ACP isolated from seeds of Brassica napus (Slabas et al., 7th International Symposium of
the Structure and Function of Plant Lipids, University of California, Davis, CA, 198S); ACP from B. campestris
seed is highly homologous. Approximately 2200 different cDNA clones were analyzed using a colony
hybridization technique (Taub and Thompson, Anal. Biochem. (1982) 128:222-230) and hybridization
conditions corresponding to Wood et al. (Proc. Natl. Acad. Sci. (1985) 82:1585-1588). DNA sequence analysis
of two cDNA clones showing obvious hybridization to the oligonucleotide probe indicated that one,
designated pCGN1Bcs, indeed coded for an ACP-precursor protein by the considerable homology of the
encoded amino acid sequence with ACP proteins described from Brassica napus (Slabas et al., 1980 supra).
Similarly to Example II, the ACP cDNA clone can be used to isolate a genomic clone from which an expression
cassette can be fashioned in a manner directly analogous to the B. campestris napin cassette.
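
The relationship between the 29-base probe and the ten-residue peptide it targets can be verified by translating the probe's reverse complement. This is a minimal sketch under two assumptions: the character printed as "Q" in the probe is read as G, and the standard genetic code applies. The helper names are illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base; ignore a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Probe as given in the text, with the garbled character read as G (assumption).
PROBE = "ACTTTCTCAACTGTCTCTGGTTTAGCAGC"

target = reverse_complement(PROBE)     # sense-strand sequence the probe pairs with
peptide = translate(target)            # one-letter residues from complete codons
leftover = target[len(peptide) * 3:]   # bases of the final, incomplete codon
```

The nine complete codons give ala-ala-lys-pro-glu-thr-val-glu-lys, and the two remaining bases (GT) begin a valine codon, which accounts for the tenth residue of the quoted peptide.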

Other Examples

Ninety-six clones from the 14-28 day post-anthesis B. campestris seed cDNA library (described in the
previous example) were screened by dot blot hybridization of miniprep DNA on Gene Screen Plus nylon filters.
Probes used were radioactively labeled first-strand synthesis cDNAs made from the day 14-28 post-anthesis
mRNA or from B. campestris leaf mRNA. Clones which hybridized strongly to seed cDNA and little or not at all
to leaf cDNA were catalogued. A number of clones were identified as representing the seed storage protein
napin by cross-hybridization with an XhoI/SalI fragment of pN1 (Crouch et al., 1983, supra), [...] B. campestris
genomic clone as a source of an embryo-specific promoter.

Other seed-specific genes may also serve as useful sources of promoters. cDNA clones of cruciferin, the
other major seed storage protein of B. napus, have been identified (Simon et al., 1985, supra) and could be
used to screen a genomic library for promoters.

Without knowing their specific functions, yet other cDNA clones can be classified as to their level of
expression in seed tissues, their timing of expression (i.e., when post-anthesis they are expressed) and their
approximate representation (copy number) in the B. campestris genome. Clones fitting the criteria necessary
for expressing genes related to fatty acid synthesis or other seed functions can be used to screen a genomic
library for genomic clones which contain the 5' and 3' regulatory regions necessary for expression. The
non-coding regulatory regions can be manipulated to make a tissue-specific expression cassette in the
general manner described for the napin genes in previous examples.

One example of a cDNA clone is EA9. It is highly expressed in seeds and not leaves from B. campestris. It
represents a highly abundant mRNA as shown by cross-hybridization of seven other cDNAs from the library by
dot blot hybridization. Northern blot analysis of mRNA isolated from day 14 seed, and day 21 and 28
post-anthesis embryos using a 700 bp EcoRI fragment of EA9 as a probe shows that EA9 is highly expressed
at day 14 and expressed at a much lower level at day 21 and day 28. The restriction map of EA9 was determined
and the clone sequenced. Identification of a polyadenylation signal and of polyA tails at the 3'-end of EA9
confirms the orientation of the cDNA clone and the direction of transcription of the mRNA. The partial
sequence provided here for clone EA9 (Figure 3) can be used to synthesize a probe which will identify a unique
class of Brassica seed-specific promoters.

It is evident from the above results that transcription or expression can be obtained specifically in seeds, so
as to permit the modulation of phenotype or change in properties of a product of seed, particularly of the
embryo. It is found that one can use transcriptional initiation regions associated with the transcription of
sequences in seeds in conjunction with sequences other than the normal sequence to produce endogenous
or exogenous proteins or modulate the transcription or expression of nucleic acid sequences. In this manner,
seeds can be used to produce novel products, to provide for improved protein compositions, to modify the
distribution of fatty acid, and the like.

All publications and patent applications mentioned in this specification are indicative of the level of skill of
those skilled in the art to which this invention pertains. All publications and patent applications are herein
incorporated by reference to the same extent as if each individual publication or patent application was
specifically and individually indicated to be incorporated by reference.



Claims

1. A seed comprising an expression cassette, said cassette comprising a seed specific transcriptional
initiation region, a sequence of interest under the transcriptional regulation of said initiation region, and a
transcriptional termination region, said expression cassette inserted into the genome of said seed at
other than the natural site for said transcriptional initiation region.

2. A seed according to Claim 1, wherein said sequence of interest is an open reading frame encoding an
endogenous protein or mutant thereof.

3. A seed according to Claim 1, wherein said sequence of interest is an open reading frame encoding an
exogenous protein.

4. A seed according to Claim 1, wherein said sequence of interest encodes a sequence complementary
to a transcription product of said seed.

5. A seed according to Claim 1, wherein said seed is of the Brassica family.

6. An expression construct comprising a seed specific transcriptional initiation region, a polylinker of
less than about 100 bp having at least two restriction sites for insertion of a DNA sequence to be under
the transcriptional control of said initiation region, and a transcriptional termination region, the sequence
of said polylinker being other than the sequence of the gene naturally under the transcriptional control of
said initiation region.

7. An expression construct according to Claim 6, wherein said transcription initiation region is from a
Brassica seed gene.

8. An expression construct according to Claim 7, wherein said gene is a napin gene.

9. An expression cassette comprising a seed specific transcriptional initiation region, a DNA sequence
of interest, other than the natural sequence joined to said initiation region, to be under the transcriptional
control of said initiation region, and a transcriptional termination region.

10. An expression cassette according to Claim 9, wherein said transcriptional initiation region is a
Brassica gene initiation region.

11. An expression cassette according to Claim 10, wherein said gene is a napin gene.

12. An expression cassette according to Claim 10, wherein said sequence of interest is a structural gene.

13. An expression cassette according to Claim 12, wherein said structural gene encodes a protein in the
biosynthetic pathway for fatty acid production.

14. An expression cassette according to Claim 13, wherein said protein is acyl carrier protein.

15. A vector comprising an expression cassette according to any of Claims 9 to 14, a prokaryotic
replication system, and a marker for selection of transformed prokaryotes comprising said marker.

16. A method for modifying the genotype of a seed comprising:
growing a plant to seed production, wherein cells of said plant comprise an expression cassette
according to any of Claims 9 to 14,
whereby seeds are produced of modified genotype.



KAP-270 Linear LENGTB - 702 

SfaNl 

rolcl Maelll MboII 

I I r I 

1 AAAGGGATGTGTCTTGGTATGTATCT^CC»GTAACAAAAQAGAASATGCRfttTC • 

36 

Alul Ahalll rokl EcoRV Hgar 

I i 111 
72 AGAGCTTTT45AAGCC34TCA5GTGTGTGCTTT4ATCITATTfiATATCATCCMT4GCGT?GT'rTAATeC5 1 
7« 62 108 117 130 

Ddel 

143 TCTTTAGATATGTTTCTGTTTCTTTCTCAGTGTCTGAATA1CT6ATAAGTGCAATGTGAGAAAGCCACACC 2 

169 

TaqI 

Sapl Hlnfl 

214 AAACCAAAATATTCAAATC7TATAWTTTAATAAIGICGAAICACTCG6AGTTGCCACCTTCT6TGCCAAT 2 
224 253 

251 

HinfZ MboII Eeoia 

I I I 

285 TCTGCTGAATCTATCACACTAAAAAAAACATTTCTTCAAGGTAATGACTTGtGGACIATCGtTCICAATTC 3 
292 310 331 

. Maall 

356 TCATTAAGTITTIATTTTTTGAAfiWTAAGTTTTTACCTTCTTTTTIGAAAAATATCGTTCATAAGATGTC 4 

424 

SphI 
ScrFI NlaIII Nsp(7524)I 

EcoRII AluI SfaNI NlaIII MaeIII 

II II I I i 

427 ACGCCAGGACATGAGCTACACATCACATATTAGCATGCAGAIGCGGACGAIVTGTCACTCACTTCAAACAC 4 
,430 442 457 464 480 
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(Page'i of 2) 



0255378 



SspI PmaCI 
SphI MboI 

Tth111I NdeI Nsp(7524)I MaeII 

AvaIII NlaIII AflIII DpnI NlaIII 

498 CTAAAAGA<^TTCTCTCTCy«:ASCACAau:ACATXTGCATGCAATATTTACAC^^ 5 
507 , 532 539 548 555 563 . 

508 531 539 550 

539 553 
.• 543 . 5Sl 

BphI Mnll 
II 

569 TCCATTCTOWCTATAAATTAGAGGCTCGGCTTCACIITTTACTCAAACCAAAACICATCACTACA^ 6 
569 583 

MboII AluI MnlI 

'I I 
640 TACACAA&SfiGCGAACAAeCTCITCCTCGTCTCGGCAACTCTCGCCTTGTTCTTCCTTCTCAC 702 
653 659 675 
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Lambda CGNl-2 

NCS-186 Linear LENGTH -4325 

XhoI 

TaqI HindIII 
AvaI AluI TaqI 

II Mil 
1 CTCaA6GCX(STCACTMCAT6AAGTTTGXCGACGACCCCAACTATGGGAA6CT7ATTTCTC7T7TCCXT 6d 

2 92 06 

3 50 

2 

SacI 

HhaI XbaI AluI 

I I II 
70 AC7CTAAT7GA6CCGTGCGCTCTATCTAGACCAATTA(^TT.GATGGAGCTCTAAAl6GTTGCTGGCTGT 138 

89 95 m 

121 

NdeI NdeI 
I I. . 

139 TTTCTTGTTCATATGATTAACTTCTAAACTTGrGTATAAATATTCTCTGAAAGTGCVTCTTTTGGCATA :207 

150 206 
208 TGTAGGTTGGGCAAAAACGAGGAAGATT6CTTCTCAATTTGGAAGA6GATGAACAGCCGAAGAAGAAAA 276' 



Sau3AI 
DdeI 

277 TAAGAATAGGCA67CCTGCTACTCAATGGATCTCAGTCTATAAC66TC6ICGTCCCAT6AAACAGAGGT 34S 

309 

305 

EcoRV 
I 

346 AAAACA7TTTTTGCATAT ACACTSTGAAAGTTCCTCACTAACTGT6TAA7CTTT7GGTASATATCACTA 414 

408 
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RlftCZI 
HhaI 

K&eiil 

DdeI 
BstEII 

BalI HaeIII AluI 

Mi .1 I 

415 CAAT6TC6GAGA(3ACAA3GGCTGMKCM7CATATACAmQG0AAATGAAGATG6CCT7TTGATTA6CTa 483 

439 469 m 

438 

439 

439 
440 
438 

AluI HinfI 
I I 
484 TGTAGCATCAGCAGCTAATCTCTGGGCTCTCATCATGGATGCTGGAACTGQATTCACTTCTCAXGTTTA 552 

498 535 
MspI 

HpaII HinfI 

I I 
553 TGAGTTGTCACCGGTCT7CC7ACACAA60TAATAATCAGTIGAAGCAATTAAGAATCAATTTGATTT6T 621 

564 606 
564 

DdeI 

622 aqtaaactaagaagXacttaccttatgttttccccgcaggactggatsatggaacaatgggaaaagaac 690 
629 

SacI 

AluI AluI AluI 

691 TACTATATAAGCTCCATAGCTGGTTCAGATAACGGGAGCTCTTTAGTTGTtrATGTCAAXAGGTTAGTGT 759 

702 710 729 

731 

760 TTAGTGAATAATAAACTTATACCACAAAGTCTTCATTGACTTATTTATATACTTGTTGTGAATTGCTAG 828 



DdeI HinfI 
829 GAAC7ACTTATTCTCAGCAGTCATACAAA67GAQTGACTCATTTCCGT7CAAGTGGA7AAATAAGAAAT 697 
842 665 
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Xmni jjql 
I I . 

898 GQAAAGAAGATTTTCATGTAACCTCCATGACAACTGCTGGTAATC9TTGGSOTGTOOTAATGTCGAGC5A 966 

908 . 961' 

Sau3AI 
BclI 

.1 

967 ACTCTGGCTTCTCTGATCAGGTAGGTTTTTGtCTCTTATTGTCTGGTQTTTTTA^TTTTCCCCTGATAQT 1035 

981 
981 

Alul Seal 

i036 CTAATAT6ATAAACTCTGCGTTGTGAAAGGT6GTGGAGCTT6ACTTTTTGTACCCAAGCGATGGGATAC 1104 

1074 1087 

Sau3AI AluI 

1105 ATAGGAGGTGGGAGAATGGGTATAGAATAACATCAATGGCAGCAACTGCGQATCAAGCAGCTITCATAT 1173 

1155 1165 

HinfI RsaI 

1174 TAAGCATACCAAAGCGTAAGATGGTGGATGAAACTCAAGAGACTCTCCGCACCACCQCCTTTCCAAGTA 1242 

1219 1242 

1242 . 

Alul 9au3AX Ei4«x 

1243 CTCATGTCAAGGTTGGTTTCTTTAGCTTTGAACACAGATTTGGATCMTTTGTTTTOTTTCCATATACT 1311 

1268 1285 1311 

DdeI 

AvaII AluI HinfI RsaI 

III II 
1312 TAGGACCTGAGAGCTTTTGGTTGATTTTTrrTTCAGGACAAATGGGCGAAGAATCTGTACATTGCATCA |l380. 

1315 1325 1363 1370 

1319 

1381 ATATGCTATGGCA6GACAGTGT6CTGATACACACTTAAGC&TCA7GTGGAAA6CCAAAGACAATTGGA6 1449 
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Hinfl 
DdeI 

II ^ . • 

1450 CGAGACTCAGGGTCGTCATAATACCAATCAAAGACGTAAAACCAGACSCAACCTCTTT6GTTGAATGTA 1518 

1456 
1454 

RsaI 

1519 ATGAAAGGSATGTGTCTTGGTATGTAT6TACGAATAACAAAACAGAAGATGGAATTAGTAGTAGAAATA 1587 

1548 

AluI EcoRV 
i I 
1588 TTTGGGAGCTTTTTAAGCCCTTCAAGXOIGCTTTTTATCTTATTGATATCATCCATTTGCGTTGTTTAA 1656 

15S6 1635 

XbaI DdeI 

I I 
1657 TGCGTCTCTAGATATGTTCCTATATCTTTCTCAGTGTCTGATAAGTGAAATCT6AGAAAACCATACCAA 1725 

1664 1687 

HinfI 

172 6 ACCAAAATATICAAMCTTAITTTTAATAATGTTGAATCACTCGGAGTTGCCACCTTCTGTGCCAATTQ 1794 

1761 

HinfI EcoRI 
I I 
1795 TGCTGAATCTATCACACTAGAAAAAAACATTtCTTCAAGQTAATSACTTGTGGACTATGrrCTGAATTC 1863 

1800 1859 
1864 TCATTAAGTTTTTATTTTCTGAA6TTTAAGTTTTTACCTTCT6TTMGAAATATATC0TTCATAAGATG 1932 



SphI 

BstNI AluI Sau3AI 
1933 TCACGCCAGGACATGAGCTACACATCGCACATAGCATGCAGATCAGOACGATTTGTCACTCACTTCAAA 2001 

1940 1950 1973 

1971 
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- , Sphi . 

DdeI AluI HhaI NdeI M«il Sau3AI 

II I I'll I 

2002 CACCTAAGAGCTTCTCTCTCACAGCGCACACACATATGCATSCAATATmCACGTGATCGCCATGCAft 2070 

2006 2012 2028 2036 2042 2058 

2044 

2071 ATClCCATTCTCACCTATAAATTAGAGCCTCGGCTTCACTCTTTACSGAAACC?ftAJUlCTCATCACTi^ 2139 



AluI 

2140 GAACATACACAAATGOCGAACAAGCTCTTCCTCOTCTCGQCAACTCTCGCCTTGTTCTTCCTTCTCACO 2208 

METAlaAsnLysLeuPheLeuValSerAlaThrLeuAlaLeuPhePheLeuLeuThr 
2164 

TaqI NaeI 
SalI MspI 
HincII HpaII 
AccI AccI HaeIII 

2209 iWVTGCCTCCGTCTACAGGACGGTTGTGGAAGTCGACOAAGATGATGCCACAAAT^^ 2277 
AsnAlaSerValTyrArgThrValValGluValAspGluAspAspAlaThrAsnProAlaGly» r^^ 
2220 2241 2271 

2239 2268 
2240 2268 
2241 2269 

Bj Hindlll 

I I I "■ 

2278 AGGATTCCAAAAT6TAGGAAGGA6TTTCAGCAAGCACAACACCTGAAAGCTT0CCAACAATGGCTCCAC 2346 

ArgIleProLysCysArgLysGluPheGlnGlnAlaGlnHisLeuLysAlaCysGlnGlnTrpLeuHis 
2281 2327 

2325 

MspI AvaII 
HpaII AvaII AluI TaqI 

2347 AACCAGGCAATGCAGTCCGGTAGTGGTCCAAGCTGOACCCTCGATGGTQAGTtTGATTTTQAAGACGAC 2415 
LysGlnAlaMETGlnSerGlySerGlyProSerTrpThrLeuAspGlyGluPheAspPheGluAspAsp 
2364 2372 2379 2388 ■ 
2364 2382 
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HaftXXI SacI 

^al flaelll Alui 

i I I II. 

2416 GTGGAGAACCAACAACA€G€CCC6CA6CAaAGGCCACC6CTGCTCCAQCAGTSCT6CAAC6A6CTCCAC 2404 
ValGluAsnGlnGlnGlnGlyProGlnGlnArgProProLeuLeuGlnGlnCysCysAsnGluLeuHis 

2438 2449 247$ 

2436 .- 2481 ■ 

BstNI HinfI 
I II 
2485 CAGGMGAGCCAC7TTGCGTTTGCCCAACCTTGAAA6GA6(»kX.CCAJULGCCGTTAA&CAACAG&TTCGA 2SS3 
GlnGluGluProLeuCysValCysProThrLeuLysGlyAlaSerLysAlaValLysGlnGlnIleArg 
2486 2UZ 

2S5i . 

2554 CAACAACAfiGGACAACAXXTGCAGGGACAGCAGATaCAGCAAGTGATTAGCCGTATCTACCAGACCGCT 2622 
GlnGlnGlnGlyGlnGlnMETGlnGlyGlnGlnMETGlnGlnValIleSerArgIleTyrGlnThrAla 

Alui Bstisi 
1 .1 
.2623 ACGCACTTACCTAGAGCTTGCAACATCAGGCAAGTTAGCATT^GCCCCTTCCAGAAGACCATGCCTGGQ 2691 
ThrHisLeuProArgAlaCysAsnIleArgGlnValSerIleCysProPheGlnLysThrMETProGly 

MspI 

HpaII XhoI 

HaeIII TaqI 

ApaI HinfI AvaI AccI 

II I II • I 

2692 CCCGGCTTCTACTAGATTCCAAACGAATATCCTCGAGAG7GTGTATACCAC(»TGATA7GAGTGTGGTT 2760 
ProGlyPheTyr 

2694 2707 2724 2736 

2692 8725 

2694 2724 

2694 

HincII RsaI 
2761 6TT6ATQTATG?TAACACTACATAGTCATGGTGTGTGT7CCA7AAATAATGTACTAATGTAATAAGAAC 2829 
2771 2813 

AccI 
I 

2830 TACTCCGTA6ACGGTAATAAAAGAGAAGTTTT7TTTTTTACTCTTGCTACTTTCCTATAAAGTGATGAT 2898 
2638 
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Sdal 
RsaI 

2899 TAACAACAGATACACCAAAAAGAAAACAAXTAATCTATATTCACAATGAAGCy^GTACTAGTCTATTGAA 29$7 

. . 2954 . 
2954 

Sau3AI 

.2968 CATGTCAGATTTTCTTTTTCTARATGTCTAAWAAGCCTTCAAfiGCTAGT(»TSAtAAAAGATCATCCA 303 6 

3028 

Sau3AI Sau3AI 
BamHI HinfI BclI 

3037 ATGGSATCCAACAAAGACTCAAATCTGGTTTTGATCAGATACTTCAAAACTATTTTTGTATTCATTAAA 3105 

3041 3053 3069 

3041 3069 

HinfI 

3106 TTAT6CAAGTGTTCTTTTATTTGGTGAAGACTCTTTAGAAGCAAAGAACGACAAGCAGTAXTAAAAAAA 3174 

3135 

3175 ACAAAGTTCAGTTTTAAGATTTGTTATTGACTTATTGTCATTTGAAAAATATAGTATGATATTAATATA 3243 
324 4 6TTTTATTTAIATAAT6CTTGTCTATTCAAGAITTGAGAACATTAATATGATACT0TCCACATATCCAA 3312 



NdeI 

3313 TATATTAA6TTTCATTTCIGTTCAAACATAT6ATAAGAIG6TCAAATGATTATGAGTTXTGTTATTTAC 3381 

3341 

TaqI Sau3AI 
AluI RsaI 

3382 CTGAAGAAAAGATAAGTGAGCTTCGAGTTTCTGAASGGTACGTGATCTTCATTTCTTQGCTAAAAGCGA 3450 

3402 3421 
3405 3425 

3451 ATATGACATCACCTAGAGAAAGCCGATAATAGTAAACTCTGTTCTTGOTTTTTGQTTTAAICAAACCGA 3519 
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M»pl 

M8pl Odel Hpall 
BpaZI Alul M9X BiA£l 

III * I II 

3520 ACCGGTAGCTGAGTGTCAAGTCAQCAAXCATCGCAAACCAIATGTGUkTTGGTTAGATTC^CSGTTTM 359.8 

3522 3528 3560 357$ 

3522 352d 3581 

35^1 

HpaII 

I 

3 SB 9 GTTGTAAACCGGTATTTCATTTGGTGAAAACCCTAQAAQCCAGCCMICC7TT7TAA'eCTAATTT7TGCA 3$57 

3598 
3598 

HlnCl 
Hindi 
D4el BstNI 

I , II I 

3638 AACGAGAAGTCACCACACCTC7CCACTAAAACGCTGAACCTTAC7GAGAGAAGCAGAGKCANKAAAGAA 3726 

3702 3718 
3715 
3714 

3727 CAAATAAAACCCGAAGATGAGACCACCACGTGCGGCGGGACGTTCA8GGGACGGGGAGGAAGASAATGR 3795 



AvaII 

AluI AvaII 

II I 
3796 CGGCGGSMNTTTGGTGGCGGCGGCGGACGTTTTQQTGGCGGCGGTGQACGTtTTGQTGGCGGCGGTGGjl 3864 

3804 3863 
3801 

EcoRV AvaII DdeI 

I I I 

3865 CCTTTGGTGGTGGATATC6TGACGAAG6ACCTCCCAG7GAAGTCATTGGTTCGT7TACTCTTTTCTTAG 3933 

3880 3892 3930 
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Hindlll 

II i I I 

3 934 TCGAATCTTATTCTTSCTCTGCrCSTTGTTTTACCGATAAAGCTTAAGACTTTATTGATAAAGTTCTCX .4002 

3937 3976 4000 

3935 3974 

AluI XmnI HinfI DdeI 

I I II 

4 C C 3 GCTTTGAATGTGAATGAACIGTTTCCTGCTIATTASTCTTCCTMGTtTTffllSOT GAXTCACTGTCTTA 4071 

4004 4023 4059 4069 

HinfI 

4072 GCACTTTTGTTAGATTCATCTTT6TGTTTAAGTTAAAAGGTAGAAACTTTCTCACTTGTCTCCGTTATG 4140 

4085 

HincII 
I 

4141 ACAAGGTTAACTTTGTTGGTTATAACAGAAGTTGCGACCTTTCTCCATGCTTGTGAGG6TGATGCTGTG 4209 
4146 

AvaII AluI DdeI Sau3AI 
I I I I 

4210 GACCAAGCTCTCTCAGGCCAAGATCCCTTACWCAATGCCCCAATCTACItG$AAAACAAOACACAGAT 42T8 

4210 4217 4222 4231 

TaqI 
SalI 
PstI 

HindIII HincII 
Sau3AI AluI AccI EcoRI 

4279 TG6GAAAGTTGATGA6ATCCAAGCTTGGGCTGCAG6TC0AC6AATTC 432S 

4294 4302 4316 4321 

4300 4314 
4313 
431S 
4316 
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EAd-14 Linear LENGTH - 416 



CCAACCCSi 



ScrPI 

Maalll 
Hpall 

C«ull Mall Kpiill Nsp 

Li I . . ■ , I. I 



7 21 59 71 

7 



MmeX . iQjoi 

Mbol ^ttitl 
Dpnl GdiXZ 

BfRil Bcli Maaiii . cftt stmz 

I III Mil 

72 GGAGAATOCACAACTCAT.CTlGATCACGGGGTAXCTSCGGTTGGAtACGSCCSAfCtAAAWVCGGATTAAA If: 
82 93 102 129 126 

9S 12^3 
93 122 
93 124 

Xhezz iiiAZv 
Seal Niazv Nltiii 

Hboz rmi Ml&xil 

R««z opnx EooRZ AvaZZ MhlZ' Hboi 

BlnZ BunHZ BlnZ HnlZ Asuz Kji«ZZ bpnZ BinZ Ceo 

.1111 I I I I II II 1 I I I I 

143 STACTGGATCCTCAAGAATTCATGGGGACCAAAATGGOGAGAACSTGGATACATCAGGATCAAAAAAGATA 2(3 
144 149 157 163 169 186 : 202 208 213 

145 191 159 189 190 200 

149 167 198 

14b 151 167 
149 170 

Nlazzz xiiii 
I«mX 

214 TCAAGCCTAAACACGGACAATGTGGTCTTGCCATGAATGCITCGtAGCCAACtAfGtS^^ 2?<; 

255 aesr- 
249 259 

HlndZZZ N8p(7524)Z 

BpmZZ AluZ Bi.h£Z kUzzX 

III 11 
285 AATATCCGGTTAAGCTTTAGaaia^TGTGTGtGTTGGTtATAATtTAACactGTGiTGCATSTAATITG 35 3 

291 299 33S 346 

297 348 ~ 
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Partial sequence of B. campestris cDNA EA9 sufficient 
to identify a genomic clone containing a promoter. The 
polyadenylation signal AATAAA is underlined, as is the poly A tail. 
The stop codon of the presumed open reading frame is double underlined. 
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